1. INTRODUCTION

Role-Based Access Control (RBAC) has proven to be a suitable access control model for managing
authorizations, especially in security administration, owing to its flexibility and its ability to
capture an organization's structure and objectives. According to NIST [1], RBAC is accepted as a security
standard in numerous domains, such as industry, the military and healthcare. Role engineering was
introduced in [2] and has been applied to define a requisite and correct set of roles and permissions.
Role mining is a concept within role engineering that is popular among researchers because it applies
computing-intensive approaches that could decrease the cost of maintaining security features and
simplify the work of security administrators.

The objectives of this paper are, firstly, to analyze and classify some of the role mining
algorithms published from 2013 to 2017 and to provide a general overview of the phases involved in
designing and developing them and, secondly, to propose a conceptual model constructed from the
analysis of those phases.

The remainder of the paper is structured as follows. Section 2 analyzes the background study.
Section 3 presents the mathematical background of this study, while Section 4 introduces the
general process in the role mining model. Lastly, Section 5 presents the conclusions and the future work
that can lead to further enhancement of this field.


2. LITERATURE REVIEW 
Numerous studies have reported that role-based access control (RBAC) has become the
predominant access control model because its principle can significantly simplify the work of
security administrators [3-6]. This principle can be stated as follows: every role is a
group of permissions, and each user obtains permissions only through roles.

According to Ye et al. [4], an RBAC system can be implemented through two approaches,
namely the top-down and the bottom-up method. The authors explain that the top-down approach
builds an RBAC system from experts' analysis of the business processes; however, this approach
consumes a lot of time because of human participation [7]. The bottom-up approach, according to Hu et al.
[8], automatically uncovers roles from the existing user-permission assignments (UPA). This is known as
role mining and, because it is based on a computing-intensive approach, it is widely applied to
build an RBAC model.

Role mining has attracted great interest as a means to build and sustain an RBAC model [9-10].
These authors identify the need for role mining algorithms that determine roles using data mining
methods, because doing so reduces the cost of allocating roles manually and thus helps to
construct a concise RBAC system. The next section provides an in-depth analysis of the methodology for
building an RBAC model using a role mining algorithm.


2.1. Role Mining Model 
Fuchs and Meier [11] introduced a general Role Mining Process Model, shown in
Figure 1, and described its phases as follows:
Figure 1. Role mining process model


2.1.1 Input Data 

In most role mining algorithms, the user-permission assignment (UPA) matrix is the
input data. Accordingly, active research is needed to explore whether other types of
data can be used in the role mining process.


2.1.2 Pre-processing 

Many researchers have emphasized the importance of this stage [12-13], particularly for generating
clean, high-quality data. The pre-processing stage usually involves cleaning noise that might
affect the results.


2.1.3 Role Detection 

This stage is significant in the role mining process model because it involves the discovery of
appropriate candidate roles from the available set of inputs, preferably clean data, using data mining
techniques or heuristic algorithms known as role mining algorithms.
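
As a minimal sketch of role detection, assume the input is a mapping from users to permission sets. The hypothetical function `candidate_roles` below takes each distinct user permission set as an initial role and adds all non-empty pairwise intersections. This is a simplification in the spirit of intersection-based miners, not a reproduction of any published algorithm; the user and permission names are illustrative only.

```python
from itertools import combinations


def candidate_roles(upa):
    """Return candidate roles from a user -> permission-set mapping.

    Initial roles are the distinct non-empty permission sets of users;
    further candidates are the non-empty pairwise intersections of them.
    """
    initial = {frozenset(perms) for perms in upa.values() if perms}
    candidates = set(initial)
    for a, b in combinations(initial, 2):
        common = a & b
        if common:
            candidates.add(common)
    return candidates


# Illustrative data (hypothetical users and permissions).
upa = {
    "alice": {"read", "write"},
    "bob": {"read", "approve"},
    "carol": {"read", "write"},
}
roles = candidate_roles(upa)
```

Here the shared permission `read` surfaces as its own candidate role even though no user holds it in isolation.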


2.1.4 Post-processing 
In this stage, the candidate roles acquired from the previous stage are selected and assigned
optimally by using other suitable algorithms.
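
One common way to realize this stage is a greedy cover. The sketch below is an assumption-laden illustration, not the method of any surveyed paper: the hypothetical function `greedy_select` repeatedly picks the candidate role that covers the most still-uncovered user-permission pairs, and a role is only given to users who already hold all of its permissions.

```python
def greedy_select(upa, candidates):
    """Greedily pick roles until every user-permission pair is covered.

    A role may be given to a user only if it is a subset of that user's
    permissions, so no extra permission is ever granted.
    Returns the selected roles and any pairs left uncovered.
    """
    uncovered = {(u, p) for u, perms in upa.items() for p in perms}
    selected = []
    while uncovered:
        best, best_gain = None, set()
        for role in candidates:
            gain = {(u, p) for u, perms in upa.items()
                    if role <= perms for p in role} & uncovered
            if len(gain) > len(best_gain):
                best, best_gain = role, gain
        if best is None:  # the remaining pairs cannot be covered
            break
        selected.append(best)
        uncovered -= best_gain
    return selected, uncovered


# Illustrative data (hypothetical users, permissions and candidates).
upa = {
    "alice": {"read", "write"},
    "bob": {"read", "approve"},
    "carol": {"read", "write"},
}
candidates = [
    frozenset({"read", "write"}),
    frozenset({"read", "approve"}),
    frozenset({"read"}),
]
selected, uncovered = greedy_select(upa, candidates)
```

In this example two roles suffice, and the candidate containing only `read` is never selected because the larger roles already cover it.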


2.1.5 Output Data
The output of these processes is normally a set of roles and an RBAC state, such as a hierarchy or
constraints.


2.2. Role Mining Algorithm Phases 

Numerous studies have proposed role mining algorithms that address the Role Mining Problem
(RMP) in Role-Based Access Control (RBAC) systems. This section analyzes and classifies
some of the role mining algorithms published from 2013 to 2017 according to the phases stated in
Section 2.1; the details are presented in Table 1.

Table 1. Role mining algorithm phases

| No. | Authors (Year) & Title | Candidate Roles (CR) Phase and Role Selection (RS) & Assignment Phase (AP) | Input & Output Data |
|---|---|---|---|
| 1 | [14] An Efficiency Approach for RBAC Reconfiguration with Minimal Roles and Perturbation | CR: generate the candidate roles using FastMiner algorithm [15]. RS & AP: select a subset from CR that minimizes the perturbation | Input: access history log & UPA matrix. Output: GR as the generated roles and QR as the qualified roles |
| 2 | [16] An Approach for Hierarchical RBAC Reconfiguration with Minimal Perturbation | CR: generate the candidate roles using FastMiner algorithm [15]. RS & AP: select the roles that are most similar to the set of qualified roles | Input: access history log & UPA matrix. Output: hierarchical RBAC state (RH) |
| 3 | [10] The RBAC System Based on Role Risk and User Trust | CR: clustering roles based on risk. RS & AP: for each role identify the permission (P) > assigned trust threshold | Phase 1: clustering roles based on risk. Input: UPA matrix. Output: RBAC roles & RH stable. Phase 2: trust. Input: RBAC roles. Output: roles |
| 4 | [17] Mutual Exclusion Role Constraint Mining Based on Weight in Role-Based Access Control System | CR: generate the candidate permission sets based on weight. RS & AP: generate all combinations of permission sets whose weighted support is greater than the user-specified minimum weighted support | Input: UPA matrix. Output: roles |
| 5 | [18] Scott D. Stoller and Thang Bui (2016), Mining Hierarchical Temporal Roles with Multiple Metrics | CR: generates initial roles and then creates additional candidate roles by intersecting sets of initial roles using FastMiner [15]. RS & AP: construct role hierarchy | Input: ACL policy. Output: role hierarchy |
| 6 | [19] Mining Approximate Roles under Important Assignment | CR: UPA is decomposed into two assignments, NUPA & IUPA. RS & AP: (1) NUPA is processed by the δ-Approx RM algorithm and generates NRoles (2) IUPA is processed by any algorithm to generate IRoles | Input: UPA matrix. Output: roles |
| 7 | [20] Mining Temporal Roles using Many-Valued Concepts | CR: construct the concepts only for a pre-determined value of 0 (many-valued concept). RS & AP: maximum area of coverage is selected and is added to the final set of concepts | Input: TUPA matrix. Output: UA, PA & REB |
| 8 | [21] Towards User-oriented RBAC Model | CR: three different ways of generating candidate roles (1) itself (2) intersection (3) association. RS & AP: enforce maximal role assignment constraints (1) greedy (2) fewest (3) most (4) rand | Input: UPA matrix. Output: UA & PA |
| 9 | [22] The Generalized Temporal Role Mining Problem | CR: creation of candidate role set by taking union of the sets of units, initial and generated roles. RS & AP: iterative selection of a minimal cardinality subset of the candidate role set using any one of the four greedy heuristics | Input: TUPA matrix. Output: R, UA, PA & REB |
| 10 | [23] Role Mining Based on Permission Cardinality Constraint and User Cardinality Constraint | CR: initial role: (1) one is from the prerequisite role set (2) the initial role set generation algorithm. RS & AP: role selection algorithm & role state generation algorithm | Input: UPA matrix. Output: UA & PA |
| 11 | [7] Role Mining based on Cardinality Constraints | CR: generating the initial role set based on cardinality constraints of roles and permissions. RS & AP: selecting role pair for role update algorithm (hierarchical relationships) & updating the initial role state (graph optimization algorithm) | Input: UPA matrix. Output: roles |
| 12 | [24] Migrating from DAC to RBAC | CR: each iteration uses the DEMiner algorithm to generate a candidate role. RS & AP: at each iteration, the new candidate role is intersected with the roles in R and UPA are performed at each iteration to reflect the updates in R | Input: UPA matrix. Output: UA & roles |
| 13 | [5] Handling Least Privilege Problem and Role Mining in RBAC | CR: (1) the first kind of candidate roles set (FCR) (2) the second kind of candidate roles set (SCR). RS & AP: least privileges principle | Input: a set of users, a set of privileges, and a set of user-privilege assignment relation. Output: roles |
| 14 | [25] Performance of AI Algorithms for Mining Meaningful Roles | CR: generating roles using elimination algorithm. RS & AP: in each generation, the elitist selection scheme is applied to guarantee that the fittest member of each generation is copied directly into the next generation (GA) | Input: RH. Output: roles |
| 15 | [26] An Optimization Framework for Role Mining | CR: three different ways of generating candidate roles (1) itself (2) intersection (3) association. RS & AP: greedy algorithm | Input: UPA. Output: UA & PA |
| 16 | [27] Visual Elicitation of Roles: using A Hybrid Approach | CR: Random Data Generator (RDG). RS & AP: matrix sorting algorithm | Input: UPA. Output: role sets |
| 17 | [28] Toward Mining of Temporal Roles | CR: enumerates the set of candidate roles from an input TUPA matrix. RS & AP: selects the least possible number of roles from the candidate roles using a greedy heuristic | Input: TUPA matrix. Output: UA, PA & REB |
| 18 | [3] Role Mining Using Boolean Matrix Decomposition with Hierarchy | CR: the candidate roles through formal concept analysis. RS & AP: redundant roles can be removed according to cost-utility analysis | Input: UPA matrix. Output: UA, PA, RH, UA' & PA' matrix |
| 19 | [29] Mining Parameterized Role-based Policies | CR: use CompleteMiner [15] to generate candidate roles. RS & AP: (1) it selects roles from highest quality to lowest (2) compute role hierarchy (3) | Input: UPA matrix. Output: UA, PA & RH |
| 20 | [30] Evolving role definitions through permission invocation patterns | CR: which are selected to optimize an objective function that balances distance from the original roles with behavioral similarity in the form of permission. RS & AP: assigned to roles according to a criterion that mitigates redundancy | Input: access history log. Output: roles |


3. MATHEMATICAL BACKGROUND


This section presents some of the formal definitions related to Role-Based Access Control
(RBAC), the Role Mining Problem (RMP) and its variants, as well as some terms associated with the
conceptual model.


Definition 1. (RBAC Model)

The RBAC model has the following basic elements [14]:
a) $U$, $R$ and $P$ denote the sets of users, roles and permissions.
b) $UA \subseteq U \times R$ represents the user-role assignments.
c) $PA \subseteq P \times R$ defines the role-permission assignments.
d) $UPA \subseteq U \times P$ is the user-permission assignments.
e) $RH \subseteq R \times R$, a partial order on roles, describes the inheritance relationships.

Definition 2. (RBAC state)

An RBAC state can be expressed as $\langle R, UA, PA, RH \rangle$ that is consistent with an access control
configuration $\rho = \langle U, P, UPA \rangle$, where $U$ is the set of all users, $P$ is the set of all
permissions and $UPA \subseteq U \times P$ is the user-permission relation [10].
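
To make the notion of consistency concrete, the following sketch derives the user-permission pairs implied by an RBAC state and compares them with UPA. It assumes, as conventions of this sketch only, that UA holds (user, role) pairs, PA holds (permission, role) pairs as in Definition 1, and RH holds (senior, junior) pairs in which a senior role inherits the permissions of its juniors. The function names and sample data are hypothetical.

```python
def juniors(role, rh):
    """All roles reachable from `role` via (senior, junior) pairs, itself included."""
    seen, stack = {role}, [role]
    while stack:
        current = stack.pop()
        for senior, junior in rh:
            if senior == current and junior not in seen:
                seen.add(junior)
                stack.append(junior)
    return seen


def effective_upa(ua, pa, rh):
    """User-permission pairs implied by UA, PA and the hierarchy RH."""
    return {
        (user, perm)
        for (user, role) in ua
        for (perm, perm_role) in pa
        if perm_role in juniors(role, rh)
    }


def is_consistent(ua, pa, rh, upa):
    """True when the RBAC state grants exactly the pairs in UPA."""
    return effective_upa(ua, pa, rh) == upa


# Illustrative state: managers inherit everything clerks can do.
ua = {("alice", "manager"), ("bob", "clerk")}
pa = {("read", "clerk"), ("approve", "manager")}
rh = {("manager", "clerk")}
upa = {("alice", "read"), ("alice", "approve"), ("bob", "read")}
consistent = is_consistent(ua, pa, rh, upa)
```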


Definition 3. (Basic RMP) 

Given a set of users (U), a set of permissions (P) and a user-permission assignment (UPA),
find a set of roles (R), a user-role assignment (UA) and a role-permission assignment (PA) while
minimizing the number of roles |R| = k. In matrix notation, it can be formulated as [12]:


$$\lVert UA \otimes PA - UPA \rVert_1 = 0 \qquad (1)$$
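
The matrix condition of the Basic RMP can be checked with a few lines of code. The sketch below assumes UA is stored as an m × k binary matrix (users by roles) and PA as a k × n binary matrix (roles by permissions), so that their Boolean product has the same shape as UPA; this storage layout and the helper names are assumptions of the example.

```python
def bool_product(ua, pa):
    """Boolean matrix product: entry (i, j) is 1 if some role links user i to permission j."""
    k = len(pa)
    n = len(pa[0])
    return [
        [int(any(row[r] and pa[r][j] for r in range(k))) for j in range(n)]
        for row in ua
    ]


def l1_distance(a, b):
    """Sum of absolute entry-wise differences between two equally sized matrices."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Columns of UPA and PA: read, write, approve (illustrative data).
UPA = [[1, 1, 0],
       [1, 0, 1],
       [1, 1, 0]]
UA = [[1, 0],
      [0, 1],
      [1, 0]]
PA = [[1, 1, 0],   # role 0 = {read, write}
      [1, 0, 1]]   # role 1 = {read, approve}

exact = l1_distance(bool_product(UA, PA), UPA) == 0
```

With k = 2 roles the decomposition reproduces UPA exactly, so the distance is zero.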


Definition 4. (δ-Approx RMP)

The δ-approx RMP is a variant of the Basic RMP that allows a partial match between the user-permission
assignment (UPA) and the generated user-role assignment (UA) and role-permission assignment (PA), which
can sometimes decrease the total number of roles, k, substantially. In matrix representation, it can be
formulated as [31]:


$$\lVert UA \otimes PA - UPA \rVert_1 \le \delta \qquad (2)$$


Definition 5. (MinNoise RMP) 

For the MinNoise RMP, the number of roles (k) is bounded, and the number of mismatches between
the UPA and the generated UPA is minimized. That is, given a set of users (U), a set of permissions (P), a
user-permission assignment (UPA) and a number of roles (k), find a set of k roles (R), a user-role
assignment UA and a role-permission assignment PA that minimize [31]:


$$\lVert UA \otimes PA - UPA \rVert_1 \qquad (3)$$
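
For binary matrices, the L1 norm used in the δ-Approx and MinNoise variants equals the number of user-permission pairs on which the original UPA and the UPA generated from the roles disagree. The sketch below counts these mismatches using sets of pairs instead of matrices; the function names, the role name `reader` and the sample data are hypothetical.

```python
def covered_pairs(user_roles, role_perms):
    """User-permission pairs granted by the given role assignment."""
    return {
        (user, perm)
        for user, roles in user_roles.items()
        for role in roles
        for perm in role_perms[role]
    }


def noise(upa_pairs, user_roles, role_perms):
    """Number of mismatching pairs, i.e. the quantity minimized by MinNoise."""
    return len(upa_pairs ^ covered_pairs(user_roles, role_perms))


def is_delta_consistent(upa_pairs, user_roles, role_perms, delta):
    """True when the mismatch count stays within the tolerance delta."""
    return noise(upa_pairs, user_roles, role_perms) <= delta


# Illustrative data: a single role approximates three users.
upa_pairs = {
    ("alice", "read"), ("alice", "write"),
    ("bob", "read"), ("bob", "approve"),
    ("carol", "read"), ("carol", "write"),
}
role_perms = {"reader": {"read"}}
user_roles = {"alice": ["reader"], "bob": ["reader"], "carol": ["reader"]}
mismatches = noise(upa_pairs, user_roles, role_perms)
```

With k = 1 the single role leaves three pairs uncovered, so the state is acceptable for δ = 3 but not for δ = 2.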


Definition 6. (User-Permission Assignment) 

The user-permission assignment (UPA) matrix is an m × n binary matrix, where m is the
number of users and n is the number of permissions. The element UPA(i, j) = 1 indicates
that permission j is assigned to user i [14].
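
As an illustration of this definition, the sketch below builds the binary matrix from a mapping of users to permission sets. Sorting users and permissions alphabetically to fix the row and column order is an arbitrary choice of this example, and the function name `build_upa` is hypothetical.

```python
def build_upa(assignments):
    """Return (users, permissions, matrix) for a user -> permission-set mapping.

    matrix[i][j] is 1 when users[i] holds permissions[j], else 0.
    """
    users = sorted(assignments)
    permissions = sorted({p for perms in assignments.values() for p in perms})
    matrix = [
        [1 if perm in assignments[user] else 0 for perm in permissions]
        for user in users
    ]
    return users, permissions, matrix


# Illustrative data (hypothetical users and permissions).
users, permissions, matrix = build_upa({
    "alice": {"read", "write"},
    "bob": {"read", "approve"},
})
```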


Definition 7. (Access History Log) 

An access history log is a series of quadruples (U, P, R, t). Each quadruple records an access event in
the system in which the user (U) invokes the permission (P) by activating the role (R) at
time t [16].
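
A log of this form can be represented directly as a list of records. The sketch below assumes integer timestamps and uses a hypothetical `AccessEvent` type; it shows how the pairs that were actually exercised, and how often, can be read off the log.

```python
from collections import Counter
from typing import NamedTuple


class AccessEvent(NamedTuple):
    """One log entry (U, P, R, t)."""
    user: str
    permission: str
    role: str
    time: int  # assumed to be an integer timestamp


def used_assignments(log):
    """User-permission pairs that appear at least once in the log."""
    return {(event.user, event.permission) for event in log}


def invocation_counts(log):
    """How many times each user-permission pair was invoked."""
    return Counter((event.user, event.permission) for event in log)


# Illustrative log entries.
log = [
    AccessEvent("alice", "read", "clerk", 1),
    AccessEvent("alice", "read", "clerk", 5),
    AccessEvent("bob", "approve", "manager", 7),
]
used = used_assignments(log)
counts = invocation_counts(log)
```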


4. CONCEPTUAL MODEL 

A conceptual model provides a comprehensive understanding of the scope of a proposed solution
using organized, interlinked concepts [32]. In this paper, the conceptual model is
constructed from the literature review in Section 2 and represents the general process in the role mining
model, as shown in Figure 2.
Figure 2. General process in role mining model


The components of the model and the relationships between them are described as follows. The
process starts with the input data. As listed in Table 1, most role mining algorithms use
variants of the user-permission assignment (UPA) matrix as input data, and some researchers also use an
access history log. The data is then passed to the pre-processing stage, whose general activities are
data cleaning and data normalization, specifically removing noise and handling
missing data. The purpose of this stage is to create a correct data set for the next process.

Next, the clean data advances to the candidate role generation phase. This
stage is the most significant part of the process because it involves the discovery of appropriate
candidate roles by applying suitable data mining techniques or heuristic algorithms, known as role mining
algorithms. This stage usually produces a large pool of candidate roles; therefore, in the next step, the
role selection and role assignment phase, a smaller and more specific set of roles is produced according
to the desired objectives and outputs by using any appropriate data mining or role mining algorithm.
Lastly, as shown in Table 1, the outputs vary, but most researchers prefer the number of roles as the
main output.
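
The flow through the phases described in this section can be sketched as a small pipeline. Every stage below is a deliberately trivial placeholder chosen for illustration, not an algorithm from the surveyed papers: pre-processing only normalizes permission names and drops users without permissions, candidate generation uses each distinct permission set, and selection keeps the roles that exactly match some user. All function names and the sample data are hypothetical.

```python
def preprocess(raw):
    """Normalize permission names and drop users without permissions."""
    clean = {}
    for user, perms in raw.items():
        normalized = {p.strip().lower() for p in perms if p and p.strip()}
        if normalized:
            clean[user] = normalized
    return clean


def generate_candidates(upa):
    """Use every distinct permission set as a candidate role."""
    return sorted({frozenset(perms) for perms in upa.values()}, key=sorted)


def select_and_assign(upa, candidates):
    """Keep roles that exactly match a user's permissions and assign them."""
    ua = {user: [r for r in candidates if r == perms]
          for user, perms in upa.items()}
    roles = [r for r in candidates
             if any(r in assigned for assigned in ua.values())]
    return roles, ua


def mine_roles(raw):
    """Input -> pre-processing -> candidates -> selection/assignment -> output."""
    clean = preprocess(raw)
    candidates = generate_candidates(clean)
    roles, ua = select_and_assign(clean, candidates)
    return {"roles": roles, "ua": ua, "number_of_roles": len(roles)}


# Illustrative noisy input: inconsistent spelling and an empty user.
result = mine_roles({
    "alice": {" Read", "write"},
    "bob": {"read", "WRITE "},
    "carol": {"approve"},
    "dave": set(),
})
```

Keeping each phase as a separate function mirrors the conceptual model: any stage can be replaced by a more capable algorithm without changing the others.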


5. CONCLUSION AND FUTURE WORK

We have proposed a conceptual model, constructed from the literature review, that represents
the general process in the role mining model. The model involves a series of phases: data input,
pre-processing, candidate role generation, role selection and role assignment, and lastly the
number of roles as the generated output. In future work, we intend to extend the conceptual model
into a more comprehensive and complete model.